ABSTRACT

Alternative splicing (AS) coupled to nonsense- 
mediated decay (NMD) is a post-transcriptional 
mechanism for regulating gene expression. We 
have used a high-resolution AS RT-PCR panel to 
identify endogenous AS isoforms which increase 
in abundance when NMD is impaired in the 
Arabidopsis NMD factor mutants, upf1-5 and 
upf3-1. Of 270 AS genes (950 transcripts) on the 
panel, 102 transcripts from 97 genes (32%) were 
identified as NMD targets. Extrapolating from 
these data, around 13% of intron-containing genes
in the Arabidopsis genome are potentially regulated 
by AS/NMD. This cohort of naturally occurring 
NMD-sensitive AS transcripts also allowed the 
analysis of the signals for NMD in plants. We show 
the importance of AS in introns in 5' or 3' UTRs in
modulating NMD-sensitivity of mRNA transcripts. 
In particular, we identified upstream open reading 
frames overlapping the main start codon as a new 
trigger for NMD in plants and determined that NMD 
is induced if 3'-UTRs are >350 nt. Unexpectedly,
although many intron retention transcripts possess 
NMD features, they are not sensitive to NMD. 
Finally, we have shown that AS/NMD regulates the 
abundance of transcripts of many genes important 
for plant development and adaptation including 
transcription factors, RNA processing factors and 
stress response genes. 



INTRODUCTION 

Alternative splicing (AS) is an important mechanism to 
control gene expression and increase the proteome com- 
plexity of higher eukaryotes (1-3). Regulated AS drives 
developmental pathways and responses to environmental 
pressures. Following transcription, splicing of the exons 
requires removal of introns by assembling a large RNP 
complex, the spliceosome, with five snRNPs and about 
180 proteins (4). Splice site selection has to be precise 
but consensus sequences defining splice sites are degener- 
ate and how a splice site is selected from many similar sites 
within a transcript remains a major question. In many 
cases, specific splice sites are used in all transcripts (con- 
stitutive splicing) while in alternative splicing, other splice 
sites are used to various extents giving rise to alternate 
transcripts with variable sequences. It is now well estab- 
lished that in addition to splice sites, sequence elements 
within exons and introns, termed either splicing enhancers 
or silencers, are binding sites for splicing factors which
either enhance or repress splicing depending on their 
activities (5,6). These splicing regulators are, for 
example, SR and hnRNP protein families, and other 
cell-, stage- or tissue-specific proteins involved in consti- 
tutive and alternative splicing which establish the splicing 
code and determine which splice site is selected (7-10). The 
regulation of alternative splicing is brought about by the 
relative levels of the RNA-binding proteins determining 
how efficiently different splice sites are used to generate 
more than one spliced mRNA from one gene. 

Alternatively spliced mRNA variants can produce func- 
tionally different protein isoforms with altered amino acid 
sequences and protein domains resulting in changes in
activity, localization, interaction partners or post- 
translational modifications (1,11). In addition, alternative 
splicing can regulate mRNA levels through the targeted 
degradation of specific AS isoforms by nonsense-mediated 
decay (NMD) (see below). In particular, alternative 
splicing can result in mRNAs with premature termination 
codons (PTCs) which could give rise to truncated proteins
that are detrimental to cell survival and energetically costly for
the cell. RNA quality control mechanisms have evolved
at all levels of gene expression to identify and remove 
aberrant RNA transcripts. One of the best investigated 
mRNA quality control mechanisms is NMD which 
degrades mRNAs which possess a premature termination 
codon (PTC+) and other physiological mRNAs without a 
PTC such as transcripts with long 3'-UTRs [reviewed in 
(12-18)]. Despite great advances in understanding of the 
NMD pathway, it is apparent that not every PTC triggers 
NMD and that this pathway controls the abundance of 
certain mRNAs which do not contain known NMD 
features, arguing that not all the factors inducing NMD 
have been identified yet. 

Several features of NMD-sensitive, PTC+ transcripts
have been elucidated and have led to models of how 
PTCs are recognized and degradation triggered. In the 
current model for mammals, NMD initiates the rapid 
decay of a transcript if translation termination is per- 
turbed [reviewed in (12-18)]. Efficient translation termin- 
ation of the ribosome is proposed to involve the 
interaction of the release factor, eRF3, and poly(A) 
binding proteins (PABP) on the poly(A) tail of the 
mRNA. If this interaction is prevented or impaired by, 
for example, an unusually long 3'-UTR, the eRF3 on 
the ribosome will bind UPF1 which then recruits UPF2
and UPF3, all core NMD proteins. This functional 
NMD complex (which includes many other proteins) 
then elicits the phosphorylation of UPF1 and rapid deg- 
radation of the transcript. This 'long 3'UTR' mechanism 
is characteristic for transcripts in invertebrates and yeast. 
In mammals, the NMD response triggered by a ribosome 
terminating at a PTC is stimulated by UPF3 associated 
with a downstream exon-junction complex (EJC) which is 
deposited on the mRNA 20-25 nt upstream of a spliced 
exon-exon junction (19,20). In the course of splicing the 
EJC complex binds the NMD factors UPF2/UPF3 which 
can then associate with a ribosome terminating at a 
PTC upstream of the EJC which has recruited UPF1 in 
the SURF complex (SMG1-UPF1-eRF1-eRF3) (21). On a
normal, non-PTC-containing mRNA, the EJC is removed 
in the first round of translation (22) except when the 
EJC is located in the 3'-UTR. This is consistent with the 
observation that introns in the 3'-UTR may significantly 
enhance NMD. Thus, in mammals, both the length of 
the 3'-UTR and the presence of an EJC complex down- 
stream of a PTC can trigger NMD. However, the recent 
demonstration in Drosophila that EJCs are not deposited 
at each splice junction such that only some introns are 
able to trigger intron-dependent NMD (23) suggests that 
NMD may rely more widely on long 3'UTR signals. 

The NMD pathway in plants is as yet not well 
characterized (24,25). Plants possess orthologues of the
key eukaryotic NMD proteins, UPF1, UPF2, UPF3 and
SMG-7 (but not SMG-1, SMG-5 or SMG-6) and these 
have been shown to be involved in degrading mRNAs 
with PTCs (26-31). Efforts to determine the rules for 
NMD substrates suggest that like mammals, plants are 
able to recognize different types of PTC-containing tran-
scripts (Figure 1c). Firstly, it was shown that both long
3'-UTRs and introns located in 3'-UTRs are signals
for efficient NMD (30,32-34). This indicates that, as
in invertebrates and yeast, the distance between a stop
codon and the PABP on the poly(A) sequence is import- 
ant and that translation termination is also likely to 
require the interaction of the release factor-containing 
ribosome with PABP. Secondly, EJC components, which 
are required for the intron-based NMD mechanism 
proposed for mammalian NMD, have been demonstrated 
to be similarly important in plants (28). Furthermore, 
upstream open reading frames (uORFs) of more than 
35 amino acids can trigger NMD in plants (35). 

Alternative splicing in plants is an important regulatory 
process for plant development and for the response of 
plants to environmental factors. However, its frequency 
of occurrence has been grossly underestimated largely 
due to low depth of sequencing and relatively few avail- 
able ESTs. The most recent estimate based on next gener- 
ation sequencing is that about 42% of intron-containing 
genes undergo alternative splicing (36) and this is still 
likely to be a significant underestimate. In humans, over 
95% of genes undergo alternative splicing and more 
importantly around 20-30% of alternatively spliced 
transcripts contain PTCs and are potentially turned over 
by NMD (37-40). The importance of the link between AS
and NMD has been highlighted by the regulation of func- 
tional transcript levels of key splicing factors such as 
SR proteins and PTB through alternative splicing via 
conserved splice sites (41,42). Conservation of alternative 
splice sites to produce PTC-containing transcripts has also 
been demonstrated for SR protein genes in lower and 
higher plants (43) and more generally, NMD seems to 
play a regulatory role in gene expression using alternatively
spliced transcripts (27). The best characterized examples 
of gene regulation by AS/NMD in plants are GRP7 
and GRP8, components of a slave oscillator coupled to 
the circadian clock. These glycine-rich RNA-binding 
proteins bind their own pre-mRNAs inducing AS which 
produces NMD-sensitive transcripts thereby auto- and 
cross-regulating their mRNA levels (44,45). Other 
examples of such regulation are SR protein genes (46), 
polypyrimidine tract binding protein (PTB) (47), 
possibly SUPPRESSOR OF OVEREXPRESSION OF 
CO1 (48) and riboswitch-regulated alternative splicing
controlling NMD (49). 

The definition of the rules of NMD in plants has
mainly relied on mutations in a small number of model 
transcripts and it is necessary to examine how these 
features correspond to those in NMD-sensitive endogen-
ous transcripts. As about 78% of alternative transcripts in
Arabidopsis introduce in-frame PTCs more than 55 nt
upstream of an exon junction, it was speculated that NMD
is a widespread mechanism for regulating gene expression
(36); however, this has not been experimentally addressed.
Figure 1. Analysis of alternatively spliced NMD substrates. (A) Schematic figure of RT-PCR panel analysis (see 'Materials and Methods' section).
(B) Venn diagram of the number of transcripts which increase significantly in upf mutants and cycloheximide treatment. (C) Venn diagram of the 
number of genes with splice isoforms which increase significantly in upf mutants and cycloheximide treatment. (D) General features of transcripts 
which trigger NMD: (i) long 3'-UTR; (ii) PTC — long 3'-UTR; (iii) splice junction downstream of authentic stop codon (3'-UTR intron); (iv) PTC — 
downstream splice junctions and long 3'-UTR; (v) uORFs in 5'-UTR. Endogenous transcripts regulated by NMD contain long 3'-UTRs, introns in 
the 3'-UTR where the splice junction is >50-55 nt from the authentic stop or uORFs (i, iii and v, respectively). Transcripts which contain PTC in the
coding region or 5'-UTR (uORF) also generate long 3'-UTRs with or without downstream splice junctions (ii, iv and v, respectively). Exons — open 
boxes; UTRs — black rectangles; thin lines — introns; diagonal lines — splicing events; stop sign — PTC or authentic termination codon. 



Two previous studies performed genome-wide transcrip- 
tome profiling of NMD-defective plants using tiling 
and expression microarrays (31,50) and found that only 
about 1% of plant protein-coding genes were up-regulated
in NMD-deficient plants. These arrays have limited ability 
to distinguish AS transcripts in contrast to the splicing- 
sensitive microarrays successfully used in animals (51-53), 
and it is therefore necessary to thoroughly investigate 
the fate and characteristics of endogenous plant AS 
transcripts turned over by NMD. 

In the absence of splicing-sensitive microarrays for 
plants, we have used a high-resolution RT-PCR system 
(54) which is able to detect multiple AS transcript isoforms 
simultaneously and obtained isoform-level measurements 
from strong, but still viable mutant alleles (26,27) of the 
NMD protein genes, UPF1 and UPF3, and cycloheximide 
(an inhibitor of translation and thus of NMD) treated 
plants. To address the link between AS and NMD we 
(i) investigated the effect of NMD impairment on a popu- 
lation (~950) of endogenous alternatively spliced tran- 
scripts, (ii) identified NMD-sensitive AS isoforms from 
significant changes in the ratio of AS isoforms, 
(iii) identified the characteristics of AS transcripts which
trigger NMD, and (iv) identified transcripts which contain
NMD features but which are insensitive to NMD. Our 
results demonstrate that alternative splicing and NMD 
affect a broad range of different genes in Arabidopsis 
and regulate expression of these genes via targeted degrad- 
ation of specific AS transcripts using different mechanisms 
depending on the position of the AS event in the gene. 

MATERIALS AND METHODS 

Plant material, growth conditions, treatments and 
RNA isolation 

Wild-type (ecotype Col-0) and UPF mutant Arabidopsis 
plants were used for the analysis. UPF mutants, upf1-5
and upf3-1 (26), were a gift from Brendan Davies
(Centre for Plant Sciences, University of Leeds, UK). 
Plants were grown in vitro on plates containing germin- 
ation medium (55). Plants were maintained in 16-h light/ 
8-h dark cycle at 22°C. Three-week-old plants were
transferred into liquid half-strength Murashige and
Skoog medium (56) and infiltrated with either 20 µM
cycloheximide or the same volume of dimethylsulfoxide 
as a control. Samples of cycloheximide-treated plants
were collected after 5 h (27). RNA was isolated using
RNeasy Plant Mini Kit (Qiagen). 

High-resolution alternative splicing RT-PCR panel and 
data analysis 

The original panel (54) was expanded to 289 primer pairs 
by identifying alternative splicing events which were either 
published, annotated in The Arabidopsis Information 
Resource (TAIR8 — http://www.arabidopsis.org/) or in 
the Alternative Splicing in Plants database (ASIP — 
http://www.plantgdb.org/ASIP/). Primer pairs where one 
primer is fluorescently labelled were designed as described 
previously (54). Primer pairs used are listed in 
Supplementary Table S1. RT-PCR analysis was per-
formed as described earlier (54). In brief, the reverse tran-
scription reaction was carried out with total RNA using
oligo-dT primers and the first-strand cDNA was aliquoted 
into microtitre plates, and PCR with the gene/alternative 
splicing event-specific primers performed using 24 cycles. 
We have previously shown that 24 cycles was still in the 
linear amplification range for various splicing substrates 
using [32P]-labelling (57) and for a number of the AS
primers used here to amplify transcripts of different abun- 
dance and size (54). The high-resolution RT-PCR system 
is capable of detecting multiple different AS transcripts 
from a gene, distinguishing alternative splicing events 
involving small size differences in transcripts (as few as 
2-3 nt) and identifying small but significant changes 
in the ratios of alternatively spliced variants. The AS 
variants for each of the genes are amplified simultaneously 
by the same primers in the same reaction. The different AS 
isoforms usually have substantial common sequence 
which will reduce variation in amplification efficiency. In 
addition, if there are differences in amplification efficiency 
among particular AS isoforms, these differences will occur 
in the PCR reactions with wild-type, mutants and 
cycloheximide treatment. Electropherograms produced 
by the ABI 3730 genotyping software identified the exact 
size of the RT-PCR products for each primer pair. Peak 
areas for each RT-PCR product were extracted from the 
three replicates and ratios of the different peaks were calculated,
generating a mean and standard error for each AS tran-
script as a percentage of the total transcript across the
three replicates.
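The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, isoform labels and peak areas are hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of the number of replicates.

```python
from math import sqrt
from statistics import mean, stdev


def isoform_percentages(peak_areas):
    """Express each isoform's peak area as a percentage of the replicate total."""
    total = sum(peak_areas.values())
    return {iso: 100.0 * area / total for iso, area in peak_areas.items()}


def summarize(replicates):
    """Return {isoform: (mean %, standard error)} across replicates."""
    per_rep = [isoform_percentages(rep) for rep in replicates]
    summary = {}
    for iso in per_rep[0]:
        values = [rep[iso] for rep in per_rep]
        sem = stdev(values) / sqrt(len(values))  # assumed SEM definition
        summary[iso] = (mean(values), sem)
    return summary


# Hypothetical peak areas (arbitrary units) for one primer pair, three replicates
wild_type = [
    {"fully_spliced": 9000, "alt_5ss": 1000},
    {"fully_spliced": 8800, "alt_5ss": 1200},
    {"fully_spliced": 9100, "alt_5ss": 900},
]
summary = summarize(wild_type)
```

Because every isoform from a primer pair is normalized to the same replicate total, the mean percentages for one primer pair sum to 100.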

Statistical analysis 

The response to genotype, treatment and genotype 
by treatment interaction was assessed by analysis of 
variance (ANOVA). Each peak of each primer was 
analysed separately assuming a completely randomized 
design with three replicate values for each treatment com- 
bination. Response was measured as the percentage con- 
tribution of a particular isoform to the total transcripts 
measured and ANOVA was carried out after an angular 
transformation of the percentage values. In addition to 
assessing the significance of genotype and treatment 
main effects and their interaction, three specific com- 
parisons (contrasts) were made: wild-type versus 
upf1-5, wild-type versus upf3-1 and wild-type versus
cycloheximide treatment. Residual plots were used to
monitor the ANOVA assumptions of approximate nor- 
mality and equality of variance. For the small number 
of cases where these assumptions did not hold, either the
response levels were all very low (or all very high) or the 
differences between treatments were so large as to render 
the ANOVA redundant. 
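As an illustration of the angular transformation step, the sketch below applies the arcsine square-root transform to isoform percentages and then computes a one-way ANOVA F statistic. This is a simplification and an assumption on our part: the analysis described above used genotype, treatment and their interaction with specific contrasts, whereas the hypothetical one_way_f function compares groups in a single factor and does not compute a P-value. All data values are invented.

```python
from math import asin, degrees, sqrt
from statistics import mean


def angular(percent):
    """Arcsine square-root (angular) transform of a percentage, in degrees."""
    return degrees(asin(sqrt(percent / 100.0)))


def one_way_f(groups):
    """F statistic of a one-way ANOVA (assumes non-zero within-group variance)."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical percentages of one AS isoform in three replicates per genotype
wild_type = [10.0, 12.0, 9.0]
upf3 = [21.0, 24.0, 19.0]
f_stat = one_way_f([[angular(p) for p in wild_type],
                    [angular(p) for p in upf3]])
```

The transform stretches values near 0% and 100%, which is why it is commonly applied before ANOVA on proportions.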

In the analysis of the direct comparisons (above and
Table 1), P-values were determined. In the subsequent
analysis, we focussed on those transcripts which showed 
a significant percentage increase or decrease with a 3% 
difference between the means of wild-type plants and 
mutants/cycloheximide-treated plants. This level of differ- 
ence was selected because we previously determined that 
when comparing variation in technical reps in the AS 
RT-PCR system, the majority of transcripts showed a 
standard error of the mean of <3% (54). 
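The selection rule in this paragraph can be written as a small filter. The sketch below is illustrative only: the function name is hypothetical, the significance cut-off alpha is left as an explicit argument, and treating a difference of exactly 3 percentage points as passing is an assumption.

```python
def classify_change(wt_mean, test_mean, p_value, alpha, min_diff=3.0):
    """Classify an isoform as 'increase', 'decrease' or None.

    wt_mean and test_mean are mean isoform percentages in wild-type and in a
    mutant or cycloheximide-treated sample. A change is reported only if the
    comparison is significant (p_value < alpha) and the means differ by at
    least min_diff percentage points (assumed inclusive threshold).
    """
    if p_value >= alpha:
        return None
    diff = test_mean - wt_mean
    if abs(diff) < min_diff:
        return None
    return "increase" if diff > 0 else "decrease"


# Hypothetical comparisons: (wild-type mean %, mutant mean %, P-value)
examples = [(10.0, 14.5, 0.01), (10.0, 12.0, 0.01), (10.0, 5.0, 0.01)]
calls = [classify_change(wt, mut, p, alpha=0.05) for wt, mut, p in examples]
```

Requiring both conditions guards against calling a statistically significant but technically negligible shift in isoform ratio.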

Sequencing analysis of AS RT-PCR products 

Many RT-PCR products corresponded to unknown splice 
variants. To identify these products, RT-PCR reactions 
were purified using Agencourt AMPure beads (Beckman 
Coulter Genomics) and re-amplified prior to cloning into 
pGEM-T. Clones with differently sized inserts were 
identified by colony PCR and sequenced by standard 
procedures. Sequences were analysed either by using 
ClustalW or spliced alignments generated by GeneSeqer 
(http://www.plantgdb.org/tool/GeneSeqer/) (58). An 
in-house Perl script was used to parse the output from 
GeneSeqer for categorizing the annotations of alternative 
splicing events. 

RESULTS 

Analysis of alternative splicing coupled to nonsense- 
mediated decay using a high-resolution RT-PCR panel 

To analyse the levels of alternatively spliced isoforms in 
mutants in the NMD protein genes, UPF1 and UPF3, 
and to investigate the link between AS and NMD in 
Arabidopsis thaliana, we have exploited an alternative 
splicing RT-PCR panel. This unique high-resolution 
system is very sensitive and capable of detecting small 
but significant changes in alternative splicing at single 
nucleotide resolution (54) (Figure 1A). It detects different 
AS variants from the same gene/region simultaneously, 
variants containing more than one AS event and novel 
AS transcripts (Supplementary Figure S1). The AS RT-
PCR panel used here consists of 289 primer pairs covering
different alternative splicing events in 270 genes, and
3 control primer pairs (Supplementary Table S1). The
AS events were selected from publications or from plant 
alternative splicing databases without prior knowledge 
of NMD-sensitivity with the exception of GRP7 and 
GRP8 (Figure 2). The AS events were mainly from 
genes encoding transcription factors, RNA-interacting 
proteins (including splicing factors) and stress-related 
proteins and included many important regulatory genes. 
The relative levels of AS isoforms were compared between 
wild-type, the two upf mutants and cycloheximide treat- 
ment (see Figures 2 and 3 and Table 1 for examples). The 
Table 1. Selected AS transcripts which increase significantly in upf mutants and/or cycloheximide treatment
Figure 2. Regulation of GRP7 and GRP8 by alternative splicing and NMD. (A) GRP7 and (B) GRP8 are known to be regulated by AS/NMD.
Figures show the GRP7 and GRP8 gene and transcript structures and the alternative splicing events in the introns: alternative 5' splice sites generate
AS isoforms which increase in abundance in the upf1-5 and upf3-1 mutants as illustrated on scans generated from the ABI 3730 data by GeneMapper
(transcripts are arrowed). The significant increases in NMD-sensitive transcript abundance are shown in histograms of the ratio of normally spliced
and alternatively spliced isoforms (shaded). Significance: ***P < 0.01; **0.01 < P < 0.05. For diagram key see legend to Figure 1.



upf1-5 and upf3-1 mutants are impaired in NMD and have
severe growth phenotypes but are viable (26,27). 
Translation is required for NMD (22,59) and the transla- 
tion inhibitor, cycloheximide, leads to accumulation of 
NMD-sensitive transcripts in plants (27). Therefore, a
5-h cycloheximide treatment was used to inhibit trans-
lation and thereby NMD (27). Consequently, AS isoforms
which are targets of NMD are expected to increase in their 
levels in the upf mutants and cycloheximide-treated plants. 
Three biological replicates were analysed for each, and 
significant changes in alternative splicing ratios were 
determined by statistical analysis (see 'Materials and 
Methods' section). 

High frequency of novel alternatively spliced transcripts 
in regulatory plant genes 

Based on publications or plant databases the majority of 
AS events were expected to generate two alternative 
transcripts. However, the number of observed RT-PCR 
products varied among the different amplified regions 
from a single product to as many as 15 different alterna- 
tively spliced products. Just over 950 RT-PCR products 
were observed using the 289 primer pairs and therefore 
approximately 350 new transcripts were discovered. 
To identify the nature of the novel AS transcripts,
cloning and sequencing of RT-PCR products was
carried out (for examples of analysis see Supplementary
Figure S1). In addition, many were identified by RNA-Seq
(our unpublished data). In general, our data show an
increase of AS frequency in our gene set by one third 
compared to presently annotated events. The identifica- 
tion of so many novel products illustrates that far more 
alternative splicing occurs in Arabidopsis than is currently 
known. 

Identification of endogenous alternatively spliced targets 
of nonsense-mediated decay 

Our high-resolution RT-PCR system allowed the deter- 
mination of the ratio of AS transcript variants for each 
gene region amplified. Mean ratios of AS products 
obtained with the upf mutants and cycloheximide treat- 
ment were compared to the wild-type values to identify 
the particular AS isoforms which increased significantly 
when NMD is impaired. Significance was determined at 
the P < 0.1 level, although for the vast majority of significant
changes in AS isoform levels, the P-value was consider- 
ably smaller (Supplementary Table S2). Of the >950 tran- 
scripts in the study, 638 showed no significant change in 
the transcript ratios between wild-type, mutants or 
cycloheximide-treated plants. Of the 313 RT-PCR 
Figure 3. Genes with AS isoforms which increase in upf mutants. (A) At4g25500: SR protein gene, At-RS40. (B) At4g33060: cyclophilin 57,
CYP57. (C) At4g02200: drought-induced protein gene, At-Di19-1. For all, the gene and transcript structures and relevant splicing events are
shown. AS isoforms which increase in the upf mutants are labelled with arrows on ABI3730 scans and the ratios of transcripts are shown in
histograms and significant increases are indicated. Significance: ***P < 0.01; **0.01 < P < 0.05. For diagram key see legend to Figure 1.



products that showed a significant change, 165 increased 
in amount in at least one of the upf mutants or 
cycloheximide treatments (Figure 1B; Supplementary
Table S2). Thirty-three transcripts increased in both
mutants and cycloheximide-treated plants (Figure 1B).
A total of 106 transcripts were increased in one or other 
mutant while 59 showed a significant increase only in the 
cycloheximide-treated plants. Cycloheximide is used 
widely as an NMD inhibitor but as cycloheximide is 
a general translational inhibitor, other RNA degradation 
pathways or cellular processes might also be affected by 
this treatment and impact on transcript levels. Therefore, 
the 106 transcripts which increased in the upf mutants and 
the 165 which increased in mutants and cycloheximide 
treatment represent a range of naturally occurring 
alternatively spliced transcripts which are putatively 
turned over by NMD and make up 11-17% of the total 
transcripts analysed and 16-25% of the alternatively 
spliced transcripts analysed. 

We found that 87 and 121 genes of the 270 AS genes on 
the panel (Figure 1C; Supplementary Table S3) had at least 
one AS isoform with increased abundance in the upf 
mutants or in the mutants plus CHX treatment, respect- 
ively, suggesting that ~32% and 45% of AS genes 
are regulated by NMD to some extent. At least 42% 
(9273 genes out of 22 302) of intron-containing genes in 
Arabidopsis are alternatively spliced (36). With the caveat 
that our gene set may contain some bias, we can extrapo- 
late to suggest that around 13-18% of intron-containing 
genes may be regulated by AS and NMD in Arabidopsis. 
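The extrapolation above is a product of two proportions: the fraction of panel genes with an NMD-sensitive isoform and the fraction of intron-containing genes that are alternatively spliced. A short sketch of that arithmetic, using only the counts quoted in this paragraph (the variable names are ours):

```python
# Counts quoted in the text
panel_genes = 270            # AS genes on the RT-PCR panel
genes_upf = 87               # genes with an isoform increased in upf mutants
genes_upf_chx = 121          # genes increased in mutants plus CHX treatment
as_genes = 9273              # alternatively spliced intron-containing genes (36)
intron_genes = 22302         # intron-containing genes in Arabidopsis (36)

frac_as = as_genes / intron_genes          # about 42% of intron-containing genes
low = genes_upf / panel_genes * frac_as    # estimate based on upf mutants only
high = genes_upf_chx / panel_genes * frac_as  # estimate including CHX treatment

print(f"{low:.1%} to {high:.1%} of intron-containing genes")
```

The result lies close to the rounded range given above and, as the text notes, it rests on the assumption that the panel is representative of AS genes in general.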

GRP7 and GRP8 are genes encoding components of 
a slave oscillator and are known to be auto- and 
cross-regulated by alternative splicing and NMD (44,45) 
and were included as controls. For both genes, the AS 
isoform which is turned over by NMD increased
significantly as expected. The GRP7 isoform increased sig-
nificantly in both mutants and cycloheximide treatment
(Figure 2A; Table 1, primer pair 206) and significant
increases in the GRP8 isoform were observed in upf3-1
and cycloheximide treatment (Figure 2B; Table 1, primer 
pair 90). These results demonstrate that the AS RT-PCR 
panel is able to detect significant changes in AS isoforms 
due to NMD. Other examples of AS/NMD transcripts 
are shown in Figure 3. At-RS40, an SR protein gene,
CYP57, a peptidyl-prolyl cis-trans isomerase gene, and
the drought-induced protein 19-like 1 gene, At-Di19-5,
have alternative splicing events which introduce PTCs 
and increase significantly in mutants and cycloheximide-
treated plants. In general, the increase in levels of the
NMD-sensitive AS isoforms in the mutants and after 
cycloheximide treatment varied with different genes. 
Interestingly, the steady state levels of the AS transcripts 
turned over by NMD varied greatly in wild-type plants 
from being virtually undetectable to tens of percent of 
the transcripts from a gene (Figure 3B and C; Table 1 
and Supplementary Table S2). Thus, AS/NMD transcripts 
from different genes are differentially abundant in 
wild-type plants which reflects the different efficiency of 
alternative splicing and rates of turnover by NMD. This 
analysis shows that coupled AS/NMD can influence gene 
expression significantly and that even low-abundance alter-
native transcripts (at steady state level) might account for
the turnover of a substantial proportion of the RNA produced
from a gene.

upf1-5, upf3-1 and cycloheximide treatment impair
nonsense-mediated decay to different degrees 

More transcripts (102 transcripts) increased significantly 
in the upf3-1 mutant than in upf1-5 (47 transcripts)
while 136 transcripts increased significantly in the
cycloheximide-treated plants (Figure 1B). This suggests
that in terms of NMD impairment, the upf3-1 allele is
stronger than the upf1-5 allele, which is consistent with the
severity of the phenotype of the particular mutants
described previously (26,27,31) and with the observation that
UPF3 transcripts are up-regulated in upf1-5 (60). In
addition to the described late flowering phenotype, we
have observed that both upf1-5 and upf3-1 mutants have
accelerated senescence and again upf3-1 showed the
stronger phenotype (Supplementary Figure S2). Some
transcripts only increased with cycloheximide treatment
and were not visible in the wild-type or mutant plants.
Thus, cycloheximide has a much stronger effect on the
number of transcripts which increased in abundance and
on the degree of increase than upf3-1 with the smallest
increases being seen in upf1-5 (Supplementary Tables S4
and S5). Some transcripts did not follow this pattern 
suggesting perhaps differential or additional functions of 
the two different NMD factors in different mechanisms 
of NMD (28). There was substantial overlap between 
transcripts which increased in abundance in cycloheximide 
treatment and in the mutants (Figure 1B). In addition,
we analysed 11 transcripts which increased only in the 
cycloheximide treatment in detail and the majority 
showed NMD characteristics (PTCs/downstream splice 
junctions or uORFs) (Supplementary Table S2). This 
suggests that AS transcripts which increase in the 
cycloheximide treatment are turned over by NMD with 
the caveat that some transcripts may be turned over by 
other degradation pathways. 

Features of alternatively spliced transcripts which are 
sensitive to nonsense-mediated decay 

Previously, 1% of plant protein-coding genes were shown
to have increased transcript levels in NMD-deficient 
plants using tiling arrays but the features of the AS tran- 
scripts which were NMD targets were not specifically 
investigated (50). The 165 endogenous AS isoforms 
which show significant increases in levels in the mutants 
and cycloheximide-treated plants (Table 1; Supplementary
Table S2) are expected to be turned over by NMD and
therefore should contain NMD signals. To identify the
NMD features, each transcript which increased signifi-
cantly in at least one of the mutants and 11 of the tran-
scripts which increased only in cycloheximide-treated
plants (117 transcripts in total) was characterized in
terms of whether it contained PTCs, had splice junc-
tions downstream of the authentic stop codon or PTCs,
had long 3'-UTR sequences or contained an upstream
ORF (Figure 1D). In addition, other AS events in the
same gene (annotated in TAIR8) and novel RT-PCR 
products for which sequence was generated were 
analysed. Of the 117 transcripts, the sequences of 8 
remain elusive and could not be characterized. Of the 
remaining 109 transcripts, 94 (86%) clearly contained 
NMD features. This includes a major group of 74 tran- 
scripts containing PTCs more than 50-55 nt upstream of 
splice junctions and long 3'UTRs, classical features of 
NMD substrates (61). The NMD-sensitivity of a further 
nine transcripts could be explained on the basis of long
3'-UTRs and/or where the distance between the authentic
stop codon and splice junction of an intron in the 3'-UTR 
was changed due to alternative splicing. In addition, the 
alternative splicing events of 12 genes (14 transcripts) 
involved introns in the 5'-UTR which affected the 
presence or absence, length and position of uORFs. 
In 11 of these transcripts, the presence of one or more 
uORFs correlated with NMD. Finally, for the remaining 
12 transcripts, the AS event monitored did not explain the 
turnover by NMD but of these, four genes had known AS 
events elsewhere in the transcripts which could generate 
NMD. The remaining transcripts could either be genes/ 
transcripts which have unknown NMD-inducing AS 
events elsewhere in the gene or may represent genes 
where changes in AS isoform levels are due to secondary 
effects on mRNA accumulation. Therefore, the vast 
majority of the endogenous NMD-sensitive transcripts 
analysed had characteristic features of NMD substrates 
in plants. 

Alternative splicing in 3'-UTRs modulates 
nonsense-mediated decay 

It is not widely appreciated that alternative splicing in the 
untranslated regions of a gene (which does not create a 
PTC) may induce NMD sensitivity and be a mechanism 
of fine regulation of transcript abundance. Also, little is 
known about the consequences of AS of such UTR 
introns. We have identified two genes in our panel where 
AS occurred in 3'-UTR introns and thus did not create a 
PTC, but at least one AS isoform in each gene was sensi- 
tive to NMD. Alternative splicing in the 3'-UTR of 
At2g38880 (NF-YB1/HAP3A transcription factor) and 
At1g72560 (PAUSED/Exportin-t) generated two and 
three isoforms, respectively, with different sensitivity to 
NMD (Figure 4). Only the isoforms with a distance 
>50-55nt from the authentic stop codon were subjected 
to NMD (Table 1). Thus, alternative splicing can affect 
the distance between the authentic stop codon and down- 
stream splice junction in 3'-UTRs and determine whether 
a transcript is turned over by NMD. 
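
As an illustration only, the stop-codon-to-splice-junction rule described above can be written as a short Python sketch. The function names, the coordinate convention (0-based transcript positions, with the distance measured from the end of the stop codon to the junction) and the 55 nt cut-off are assumptions made for this example; the text gives the threshold only as a 50-55 nt range, and the positions used in any call are hypothetical rather than taken from Table 1.

```python
def downstream_junction_distances(stop_end, junctions):
    """Distances (nt) from the end of the stop codon to each downstream splice junction.

    stop_end  -- transcript position just after the stop codon (assumed convention)
    junctions -- transcript positions of exon-exon junctions
    """
    return [j - stop_end for j in junctions if j > stop_end]


def junction_rule_predicts_nmd(stop_end, junctions, threshold=55):
    """True if any downstream junction lies more than `threshold` nt from the stop codon.

    The default of 55 nt is an assumption; the rule is stated as 50-55 nt.
    """
    return any(d > threshold for d in downstream_junction_distances(stop_end, junctions))
```

Under these assumptions, an isoform whose only 3'-UTR junction lies 30 nt after the stop codon would be predicted to escape NMD, whereas one with a junction 80 nt downstream would be predicted to be a target. The sketch encodes the junction rule alone and ignores 3'-UTR length and uORFs.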

The position of PTCs defines the length of 3'-UTRs 
which can trigger nonsense-mediated decay 

Current models for NMD suggest that the length of the 
3'-UTR (the distance between the stop codon or PTC and the 
3'-end of the transcript) can be one of the 
triggers of NMD (Figure 1C). For all of the analysed AS 
transcripts containing PTCs, we analysed the distance 
between the first PTC and the 3'-end of the transcript 
(Figure 5; Supplementary Table S2). These distances 
appeared to show a bimodal distribution with the 
majority (58 transcripts) being longer than 600 nt and 
twenty transcripts (from 16 genes) shorter than 550 nt. 
When these latter transcripts were examined, two-thirds 
were from genes with relatively short coding sequences 
(500-720 bp) and all but five had splice junctions down- 
stream of the PTC such that they conformed to expected 
features of NMD substrates. The five transcripts without a 
downstream splice junction had PTC to 3'-end distances 
of 366, 370, 440 nt and two transcripts had 441 nt and, 
assuming that there are no other AS events in the genes 
causing NMD, these could represent 'long 3'-UTR' transcripts. 
We also determined the distance between the authentic 
stop codon and 3'-end of the gene in the normally 
spliced transcripts from the same genes. The mean 
distance was 242 nt and the majority of transcripts were 
in the range of 22-350 nt (Figure 5). Eight fully spliced 
transcripts (from 7 genes) had a 3'-UTR of >350 nt 
ranging from 354 to 718 nt. Taken together, these data 
suggest that long 3'-UTRs which trigger NMD in 
Arabidopsis mRNAs are in general >350 nt but there are 
exceptions where transcripts with longer 3'-UTRs do not 
show evidence of NMD. 

Figure 4. Alternative splicing of introns in the 3'-UTR influences 
turnover of AS isoforms by NMD. Exon-intron structures of genes 
and transcripts of (A) At2g38880, NF-YB1 transcription factor 
(NF-YB1/HAP3a), and (B) At1g72560, PSD/exportin-t. Alternative 
splicing of the 3'-UTR introns in these genes generates transcripts 
with different distances between the authentic stop codon and 
downstream splice junction, consistent with the 50-55 nt rule where distances 
>50-55 nt trigger NMD. Stop codon to splice junction distances are 
indicated along with whether the AS isoform is turned over by NMD 
or not. For diagram key see legend to Figure 1. 
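
The distance measurement used in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure used in the study: the sequence is taken to be the mature transcript as DNA letters without a poly(A) tail, the position of the translation start is supplied by the caller, and the function names are hypothetical. The 350 nt cut-off is the approximate value suggested by the data above, to which the text notes exceptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(seq, start):
    """Index of the first base of the first in-frame stop codon after `start`, or None."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def stop_to_end_distance(seq, start):
    """Number of nucleotides between the first in-frame stop codon and the 3'-end."""
    stop = first_in_frame_stop(seq, start)
    if stop is None:
        return None
    return len(seq) - (stop + 3)


def is_long_3utr(distance, threshold=350):
    """True if the stop-to-3'-end distance exceeds the assumed 350 nt threshold."""
    return distance is not None and distance > threshold
```

Applied to an AS isoform, the first in-frame stop found in this way is the PTC when it lies upstream of the authentic stop codon, so the same function gives the PTC to 3'-end distance discussed in this section.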

uORFs overlapping the main start codon induce 
nonsense-mediated decay 

Previous studies in Arabidopsis suggest that uORFs can 
trigger NMD (35,60). Preliminary features of such 
uORFs were defined using model constructs as being the 
first uORF in a transcript, at least 10 nt from the 5'-end 
and longer than 35-50 amino acids (35). However, the 
uORF in AtMHX was only 13 amino acids long and 
affected mRNA levels and translational efficiency (60). 
In addition, the link between AS in 5'UTRs and the 
presence, size, and positions of uORFs which may then 
trigger NMD has never been investigated in plants or 
in endogenous transcripts. In our study, we identified 
12 genes with alternative splicing of 5'-UTR introns 
where transcripts increased significantly in upf mutants 
and/or cycloheximide treated plants. uORFs of between 
3 and 123 amino acids were present in the fifteen different 
AS isoforms of these genes. 

Seven genes had uORFs with the interesting unifying 
feature that the AS isoforms turned over by NMD con- 
tained an uORF which overlapped the translation start 
site of the main ORFs (Figure 6A-C; Supplementary 
Figure S3; Table 1). For example, in the zinc finger 
protein gene (At2g02960), alternative splicing produces 
six different AS transcripts through use of multiple alter- 
native 3' splice sites (Figure 6A). Four of these produce 
a 26 amino acid uORF which overlaps the AUG transla- 
tion start site of the main ORF and all four are NMD 
sensitive. The fully spliced transcript and two shorter 
alternatively spliced isoforms contained short uORFs 
which do not overlap with the AUG. AS in the 5'UTR 
of At3g49430 (At-SR34a) generated three new uORFs of 
13, 30 and 61 amino acids (Figure 6B). The stop codon of 
the 61 amino acid uORF overlapped the AUG of the main 
ORF and this transcript is NMD sensitive. Similarly, 
the fully spliced product of At3g20270 contains a uORF 
of 22 amino acids where the stop codon of the uORF lies 
downstream of the AUG of the main ORF (Figure 6C) 
and is NMD sensitive whereas the other AS transcripts 
without this feature are NMD resistant. Other examples 
of genes showing this phenomenon are shown in 
Supplementary Figure S3. Taken together, our analysis 
shows that many transcripts containing a uORF which 
overlaps the authentic start codon are subject to NMD 
and identifies a new feature of uORFs which is capable 
of inducing NMD in plants. The analysis also indicates 
that not all features of uORFs which trigger NMD have 
been resolved as we find examples where there is no 
correlation between the presence of uORFs of between 43 and 
92 amino acids and NMD (see 'Discussion' section). 

Figure 6. Alternative splicing of introns in the 5'-UTR affects the 
presence, size and position of uORFs and influences turnover of AS 
isoforms by NMD. Exon-intron structures of genes and transcripts of 
(A) At2g02960, zinc finger transcription factor; (B) At3g49430, SR 
protein gene At-SR34a; and (C) At3g20270, lipid-binding serum 
glycoprotein gene. These examples illustrate alternative splicing events 
in 5'-UTR introns which generate uORFs which overlap the main ORF 
and correlate with NMD. (A) Four of the AS isoforms contain a 26 
amino acid uORF (A-D) which overlaps the translation start site of the 
main ORF and correlates with NMD; (B) a uORF of 61 amino acids 
overlaps the authentic translation start codon in the AS product; and 
(C) uORF1 overlaps the translation start of the main coding sequence 
in the fully spliced transcript while in the alternatively spliced isoform 
the stop codon of uORF2 lies upstream of the main translation start 
site. Shaded rectangles below transcripts: uORFs; FS: fully spliced; 
AS: alternatively spliced. Sequences below the figures show the 
relationship between the stop codon of uORFs and the translational 
start AUG of the main ORF. For diagram key see legend to Figure 1. 
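
The defining feature reported in this section, a uORF whose reading frame runs into or beyond the AUG of the main ORF, can be checked computationally along the following lines. This Python sketch is illustrative and rests on several assumptions: the transcript is supplied as DNA letters, the position of the main AUG is known, every upstream ATG is treated as a potential uORF start, and an upstream ATG in the same frame as the main ORF with no intervening stop is treated as an N-terminal extension rather than a uORF. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(seq, main_start):
    """Return (start, stop_end) for every ORF whose ATG lies upstream of main_start.

    stop_end is the position just after the uORF stop codon.
    """
    uorfs = []
    for start in range(main_start):
        if seq[start:start + 3] != "ATG":
            continue
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                uorfs.append((start, i + 3))
                break
    return uorfs


def overlapping_uorfs(seq, main_start):
    """uORFs that end downstream of the first base of the main AUG.

    ORFs in the same frame as the main AUG are skipped, because reading
    through the main AUG in frame extends the main ORF instead.
    """
    hits = []
    for start, stop_end in find_uorfs(seq, main_start):
        in_main_frame = (main_start - start) % 3 == 0
        if stop_end > main_start and not in_main_frame:
            hits.append((start, stop_end))
    return hits


def uorf_length_aa(start, stop_end):
    """Length of the encoded peptide in amino acids, excluding the stop codon."""
    return (stop_end - start) // 3 - 1
```

Such a check only flags candidates. As noted above, uORFs of 43 to 92 amino acids did not always correlate with NMD, so the presence of an overlapping uORF should be read as a predictor to be tested against upf mutant data, not as proof of NMD sensitivity.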

Splice isoforms with retained introns are not sensitive 
to nonsense-mediated decay 

Intron retention is the most abundant type of alternative 
splicing in plants (41%) (62). In general, due to the 
UA-richness of plant introns, most intron retention 
events create PTC transcripts. These transcripts are con- 
sidered as potential targets of NMD as it has been shown 
in other organisms that transcripts with retained introns 
are turned over by NMD (63). Of the 90 characterized AS 
transcripts which increased in one or both of the upf 
mutants, only four had retained introns (Supplementary 
Table S2) suggesting that IR transcripts were under- 
represented. We therefore examined all of the readily 
detectable intron retention events on our panel where 
the IR transcripts made up at least 2% of the total. Of 
the 29 such IR transcripts, nineteen contained PTCs with 
downstream splice junctions and/or had long 3'-UTRs 
(all >400nt) and therefore have features of typical 
NMD substrates (Table 2; Figures 1D and 7). Despite 
containing these NMD signals, these intron retention 
isoforms did not increase in abundance in the upf 
mutants and/or cycloheximide treatment suggesting that 
they are not turned over by the NMD pathway. 
Interestingly, in a number of these genes, other alterna- 
tively spliced transcripts were produced which contained 
PTCs and were subject to NMD. In two cases in particular, 
the other AS events involved the same intron 
as the retained intron (Figure 7A and B). In 
At5g37055 (SEF, SERRATED LEAVES AND EARLY 
FLOWERING), there are three different intron retention 
transcripts involving introns 1 and 2, all containing PTCs, 
none of which is a target of NMD (Figure 7C). However, 
use of an alternative 3' splice site in exon 3 generates a 
PTC+ transcript which is turned over by NMD (Figure 
7C and Table 2). A further example is At5g24270, coding 
for SOS3 (SALT OVERLY SENSITIVE 3), a 
calcineurin-like protein, where retention of intron 5 was 
NMD resistant while use of an alternative 5' splice site in 
the next intron, also creating a PTC, significantly 
up-regulated the latter transcript in upf3-1 and 
cycloheximide-treated plants (Figure 7D and Table 2). 
The first PTCs generated in these two transcripts are only 
30 nt apart (Figure 7D), arguing against a position effect of 
the PTC. Thus, transcripts from the same gene which 
generate PTCs in very similar positions either through al- 
ternative splicing or intron retention events can be differ- 
entially sensitive to NMD. It appears that if a transcript is 
generated where an intron has not been spliced (retained 
intron) then the transcript is not NMD sensitive. 

Besides the above PTC+ NMD-insensitive events, six 
other IR transcripts were effectively PTC- and were not 
targets of NMD (Supplementary Table S6). In these 
transcripts, the introns were either (i) in frame (no PTC), 
(ii) towards the end of the transcript such that the PTC 
was close to the authentic stop and would lead to a change 
in C-terminal sequence, (iii) in the 5'-UTR with uORFs 
which do not trigger NMD, or (iv) there was no evidence 
of an intron at the suggested position in the transcript. 

Table 2. PTC+, NMD-insensitive intron retention transcripts and NMD-sensitive AS transcripts from the same genes 

Only four intron retention transcripts increased significantly 
in the mutants and/or cycloheximide treatment 
(Supplementary Tables S2 and S6). One of these 
(At4g36960) retained intron 1 in the 5'-UTR which 
generated a uORF overlapping the authentic translation 
start site and therefore is expected to trigger NMD. In the 
other cases, potential NMD-causing AS events may 
occur elsewhere in the gene. 

In conclusion, our analysis suggests that plant transcripts 
with retained introns are usually not targets for 
NMD provided there is no other alternative splicing event 
which produces features for NMD in the transcript. 

Figure 7. Intron retention transcripts are not turned over by NMD. Schematic figures of genes which produce detectable intron retention transcripts 
and other alternatively spliced transcripts with different NMD phenotypes: (A) At1g76460; (B) At1g49730; (C) At5g37055 and (D) At5g24270. Below 
each gene, structures of transcripts in the amplified region are shown and display different alternative splicing events. Data for each transcript are 
shown alongside (-: no change; +, ++ and +++: transcript level increases significantly in upf mutants with P<0.1, 0.01<P<0.05 and P<0.01, 
respectively). For diagram key see legend to Figure 1. Grey lines below introns labelled IR: retained introns; FS: fully spliced; AS: alternatively 
spliced. 



DISCUSSION 

Alternative splicing is a major determinant in the produc- 
tion of variant mRNA transcripts some of which contain 
PTCs and might be targeted by NMD. This pathway has 
a significant impact on the expression of genes involved 
in plant development and adaptation [reviewed (24,25)]. 
This raises the important questions of how frequently the 
expression of plant genes is regulated by coupled AS/ 
NMD and what are the structural features of endogenous 
NMD substrates. Using a high-resolution RT-PCR 
system we have examined a large population of 950 
endogenous transcripts from 270 genes and have 
characterized over 100 NMD-sensitive AS transcripts in 
detail. This represents the most extensive and accurate 
analysis of AS and AS/NMD in endogenous transcripts 
in plants. We demonstrate (i) a previously unknown 
high overall prevalence of AS and AS/NMD; (ii) that 
NMD-sensitive transcripts are readily detected in 
wild-type plants, often representing substantial proportions 
of the total transcripts of a gene; (iii) that AS in 
5'-UTRs and 3'-UTRs regulates transcript levels by rendering 
them NMD sensitive; (iv) that uORFs overlapping 
the start codon can trigger NMD; and (v) that transcripts 
with intron retention events in plants do not trigger NMD 
even though they possess classical features inducing 
NMD. 

Coupling AS to NMD is a frequent event in plant 
gene expression 

In the course of our analysis we have discovered an 
unexpectedly high number of novel AS transcripts. This 
follows from the sensitivity of the RT-PCR system which 
can detect transcripts of <1% of the total transcripts of a 
gene; on the other hand, many of the novel transcripts 
were abundant but not represented in databases. Thus, 
clearly much more AS is occurring in Arabidopsis than is 
currently estimated and annotated, especially considering 
that we have assessed AS/NMD only in plants at one 
developmental stage (3-week-old) and have not taken 
into account many developmental stage-, tissue- or 
condition-specific AS events. 

Genome-wide analyses in eukaryotes have shown that 
up to 20% of the transcriptome can be affected by NMD 
[reviewed in (12,53,64)] and that around 20-30% of alter- 
natively spliced transcripts in humans contain PTCs and 
are potential targets of NMD (37). In plants, little 
is known about the contribution of NMD to regulation 
of gene expression. A recent genome-wide tiling array 
analysis in Arabidopsis found that only around 1% of 
plant protein-coding genes were affected (50) while 
an RNA-Seq analysis which showed that 42% of 
Arabidopsis genes underwent AS predicted that around 
78% of AS isoforms could be putative targets of NMD 
(36). By comparing our AS/NMD gene set to those 
detected using expression/tiling arrays (31,50) we found 
only two genes in common (data not shown) and, in 
addition, known AS/NMD substrates such as GRP7 and 
GRP8 or SR genes (45,46) were not detected using expres- 
sion or tiling arrays (31,50). Similarly, very little overlap in 
NMD-affected gene sets was found in Drosophila when 
comparing expression microarray and splicing-sensitive 
array results (53). This discrepancy is most likely due to 
the sensitivity and resolution of the AS RT-PCR panel 
which is able to detect significant changes in individual 
transcript levels which would not be detected in micro- 
array experiments where the usual cut-off is >1.5-2-fold. 

In this study, 11-17% of the total number of transcripts 
and 16-25% of the alternatively spliced transcripts 
analysed were potential NMD substrates suggesting that 
about 32-44.8% of AS genes are regulated by NMD. 
Extrapolating from these values and the estimate of 
the frequency of AS (36), about 13-18% of Arabidopsis 
intron-containing genes are potentially regulated by 
AS/NMD. This compares well to the 14% and 20% 
reported for Drosophila and Caenorhabditis elegans 
(64,65). 
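
The extrapolation above is consistent with a simple product of two proportions, sketched here in Python. That the calculation was performed in exactly this way is an assumption; the 42% value is the AS frequency quoted above from the RNA-Seq analysis (36), and the function name is hypothetical.

```python
def extrapolate_as_nmd(frac_as_genes_nmd, frac_genes_with_as=0.42):
    """Fraction of genes potentially regulated by AS/NMD.

    frac_as_genes_nmd  -- fraction of AS genes with NMD-sensitive isoforms
    frac_genes_with_as -- fraction of genes undergoing AS (0.42, from ref. 36)
    """
    return frac_as_genes_nmd * frac_genes_with_as


low = extrapolate_as_nmd(0.32)    # lower bound of the 32-44.8% range
high = extrapolate_as_nmd(0.448)  # upper bound of the 32-44.8% range
print(f"{low:.1%} to {high:.1%}")
```

Under this assumption the product gives about 13.4% and 18.8%, close to the 13-18% range quoted in the text.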

Features of NMD-sensitive transcripts in plants 

Rules for NMD in plants have been established based on 
the behaviour of a small number of genes or artificial 
constructs (28,32-35). While the general principles of 
intron-based and long 3'-UTR dependent NMD have 
been described (Figure 1C), little investigation of this 
behaviour in endogenous NMD-sensitive transcripts has 
been performed until now. Here, we determined the 
features of individual AS transcripts and found that the 
majority of AS/NMD transcripts (~85%) contained 
PTCs with downstream splice junctions and/or long 
3'-UTRs and therefore comply with existing NMD rules. 
In plants the average length of the 3'-UTR is 241 nt (66) 
and our results show that a 'long 3'-UTR' capable of trig- 
gering NMD in Arabidopsis is usually >350nt. A similar 
estimate was obtained using NMD-test constructs where 
instead of the 3'-UTR length, the distance between a PTC 
and the authentic stop codon was defined previously as 
around 300 nt (34). In addition, however, we identified a 
number of exceptions where transcripts with 3'-UTRs 
>350nt were not turned over by NMD suggesting that 
additional yet unidentified features are involved in trigger- 
ing NMD. 

Our results demonstrate that alternative splicing of 
introns in either the 3'-UTR or 5'-UTR can determine 
whether transcripts of endogenous genes are targets of 
NMD or not and thereby regulate transcript levels. We 
identified two genes where AS in 3'-UTR introns rendered 
AS transcripts NMD-sensitive by increasing the distance 
between the authentic stop codon and the splice junction 
to more than 50 nt. This agrees with the rules for EJC 
complexes triggering NMD (Figure 1D). Interestingly, 
one of the genes showing regulation by AS/NMD in a 
3'-UTR intron is the NF-YB1 transcription factor 
subunit involved in photoperiod-regulated flowering and 
in drought stress responses, and whose over-expression 
leads to increased drought resistance (67). 

Alternative splicing of introns in 5'-UTRs can change 
the length and sequence of the 5'-UTR or remove the 
authentic AUG (again altering the 5'-UTR) and thereby 
affect the presence/absence, number, size and position of 
uORFs. In humans, polymorphisms or mutations which 
create or remove uORFs can suppress mRNA and 
protein levels and cause disease (68). uORFs in 5'-UTR 
regions can affect gene expression by different mechan- 
isms: encoding an active peptide, affecting translational 
efficiency or reducing transcript levels by triggering 
NMD (68-70). In eukaryotes, ribosomes generally load 
onto mRNAs at the 5'-end and scan to the first AUG 
translation start codon. If a uORF is translated, the 
uORF stop codon might be recognized as a PTC (with 
the additional features of creating a long 3'-UTR and 
the high likelihood of downstream splice junctions), 
thereby targeting the transcript to the NMD pathway. 
Around 20% of plant genes contain uORFs (35,71) but 
their fate in terms of whether they are translated or 
scanned through, trigger NMD or allow re-initiation of 
translation is not known. We found a strong correlation 
between presence of an uORF which overlapped the AUG 
of the main ORF and NMD most likely triggered by 
generating a long 3'-UTR and downstream splice junc- 
tions. We also found other genes where the presence of 
'fully upstream' uORFs correlated with activation of 
NMD which indicates inefficient reinitiation of translation 
of their main ORFs. However, other AS transcripts con- 
tained short uORFs and/or 'long' uORFs (e.g. 43, 55 and 
92 amino acids) which did not trigger NMD. Thus, 
the factors which determine whether or not particular 
uORFs activate NMD are clearly complex and poorly 
understood. With around 20% of Arabidopsis genes 
containing uORFs and the frequent occurrence of AS in 
5'-UTRs, AS/NMD involving uORFs is likely to be 
important in regulation of expression of many plant genes. 

Retained introns do not trigger NMD 

Besides identifying NMD-sensitive AS transcripts, we also 
identified AS transcripts which contained NMD signals 
but which were immune to NMD. A comprehensive 
analysis of AS/NMD in mammalian tissues indicated 
that not all characteristics of NMD-targeted RNAs have 
been identified and that not all RNAs containing known 
NMD features are in fact turned over by NMD (52). 
Surprisingly, we found that the majority of intron reten- 
tion transcripts which we analysed were not turned over 
by NMD despite containing PTCs, downstream splice 
junctions and long 3'-UTRs. However, transcripts from 
the same gene with other types of alternative splicing 
events in the same or nearby intron which generated 
PTCs in very similar positions were sensitive to NMD. 
This unexpected finding is in contrast to current assumptions 
that plant transcripts with retained introns and PTCs 
are subject to NMD as such transcripts have been found 
on ribosomes (72,73), a prerequisite for NMD. More 
importantly, in other organisms transcripts with retained 
introns and PTCs are subjected to NMD suggesting a dif- 
ferent strategy in plants as intron retention transcripts 
avoid the NMD machinery and have a different fate 
[see below; (63,74)]. In addition, our clear demonstration 
that intron retention events which create a PTC are NMD 
insensitive may explain the high frequency of IR events 
identified in plants where it constitutes the major AS 
event. 

We previously detected aberrant mRNAs in the nucle- 
olus of Arabidopsis of which around 80% were intron 
retention events (75). We also found UPF2 and UPF3 to 
localize to the nucleolus and hypothesized that this may be 
the site of assembly of NMD factors onto aberrant 
mRNAs prior to NMD. From the data presented here, 
IR transcripts avoid NMD and their accumulation in 
the nucleolus may therefore have a different function. 
Although the current model of NMD in mammals is 
that PTCs are recognized in the pioneer round of transla- 
tion as the mRNA exits the nuclear pore, it is not clear 
whether this model applies to plants. However, if transla- 
tion is required for NMD, one possible explanation is 
that transcripts containing introns or intron fragments 
are recognized as aberrant prior to export by virtue of 
proteins binding to the UA-rich intron sequences (75) 
and therefore do not connect with the NMD machinery. 
Further research will be needed to determine the fate 
of different plant PTC-containing transcripts in terms of 
their intranuclear and intracellular localization, dynamics 
and transport and how and why intron retention 
transcripts escape NMD in plants. 

Significance of AS/NMD for plant gene expression 
pattern 

The endogenous NMD-sensitive transcripts showed great 
variation in their steady state levels (i.e. levels detectable in 
wild-type plants) and in the degree of increased abundance 
in the different mutants and cycloheximide treatment. 
This variation is likely to reflect gene-specific differences 
in transcription levels, frequency of AS producing the 
different isoforms, or tissue-specific AS occurring only in 
particular organs or cell types. In addition, the 
transcript-specific efficiency of NMD turnover could 
reflect features of the transcripts such as position of 
PTC and downstream splice junctions, length of 3'-UTR 
and RNA secondary structure. Importantly, for some 
genes, non-productive mRNAs (PTC-containing; unable 
to produce full-length protein) which includes transcripts 
targeted by NMD and transcripts which are NMD- 
insensitive such as intron retention transcripts, form a 
significant proportion of steady state levels of transcripts 
in wild-type plants. One consequence is that traditional 
expression microarrays are unable to distinguish between 
productive and non-productive mRNAs and therefore 
functional transcript abundance for these genes is 
over-estimated. Clearly, alternative splicing information 
must be integrated with transcriptional data to provide 
true measures of gene expression. 



The coupling of AS and NMD is an important general 
mechanism in gene expression regulation. It modulates the 
relative levels of mRNA isoforms from a gene which are 
either productive (protein-coding) or unproductive AS 
variants and thereby regulates protein levels. Recent 
examples of plant genes regulated or putatively regulated 
by AS/NMD in plants are GRP7/8 and SOC1 (involved in 
the circadian clock and flowering control, respectively), 
SR and PTB protein splicing factors (involved in a 
range of developmental and stress response processes) 
and HSF2A (a heat shock factor) (43-48,76). Here, 
despite only around 270 genes being analysed, AS/NMD 
has been identified in 121 genes (Table 1; Supplementary 
Table S3) playing central roles in cellular processes: tran- 
scription factors, splicing factors, RNA-binding proteins, 
RNA helicases, spliceosome and exon junction complex 
proteins, tRNA export, signal recognition particle 
and ribosomal proteins. Components of developmental 
pathways also show NMD-mediated turnover of AS tran- 
scripts, for example, different MAF genes and VRN2 
(flowering time) and CCA1 and PRR9 (core circadian 
clock). Finally, a number of genes involved in signalling 
and stress response pathways undergo AS/NMD: the 
calcium-dependent salt stress signalling pathway protein 
genes SOS2 and SOS3, phosphatases and kinases (e.g. the 
SNF1-like protein kinase, AtKIN11) and various temperature, 
drought and salt response factors (e.g. SRF2, 
HSF2A). The identification of many genes in a wide 
range of processes and pathways suggests that AS/NMD 
is a widespread regulatory mechanism in plants. 

ACCESSION NUMBERS 

The Arabidopsis Genome Initiative numbers for UPF1 
and UPF3 are At5g47010 and At1g33980, respectively. AGI locus identifiers 
of genes analyzed in this article are listed in 
Supplementary Table S1. 